ABSTRACT

Distinct multiple regression models are discussed, 
along with issues and problems related to their application to the 
practical problem of identifying effective schools. Statistical 
results obtained from the application of each model are discussed. 
The first model was implemented in 1984, and the second in 1992. In 
1984, it was decided to identify effective schools in terms of 
student achievement in the basic areas of reading, mathematics, and 
language usage. The emphasis was on standardized test scores. In 
1992-93, the outcomes used for effectiveness indices include: (1) a 
nationally normed standardized test; (2) a state-mandated
criterion-referenced test with a writing sample; (3) 143 separate
course-related criterion-referenced tests; (4) student promotion 
rate; (5) student graduation rate; (6) attendance rate; and (7) 
percentage of students taking the Scholastic Aptitude Test and 
average scores. In 1993-94, additional indices will be added. The 
effectiveness indices are an important part of the three-tier system 
of accountability being implemented in the Dallas (Texas) public 
schools. One table and one figure present analysis results. A 55-item 
list of references is included. Also provided is a description of the 
school performance improvement awards, a financial awards plan for 
effective schools. (SLD) 
One of the most important aspects of administering a large urban school district is the
identification of schools that are unusually effective with the students that they serve. Once
effective schools are identified, detailed studies can be launched to determine the concomitants
of their effectiveness and those concomitants can be replicated in similar environments across the
District. Researchers investigating effective schools have consistently identified five to seven 
factors that are correlated with improved school achievement (Good and Brophy, 1986; Purkey 
and Smith, 1983). These factors include a sense of mission (Brookover and Lezotte, 1979; Ohio 
Department of Education, 1981); strong building leadership (Edmonds, 1982; Shoemaker and
Fraser, 1981); high expectations for students and staff (Clark, Lotto, and McCarthy, 1980;
Eubanks and Levine, 1983); frequent monitoring of student progress (Edmonds, 1982); a 
positive, orderly learning climate (Edmonds, 1982; Shoemaker and Fraser, 1981); sufficient 
opportunity for learning (Brookover, et al., 1979; Edmonds, 1982; MacKenzie, 1983); and 
parent/community involvement (Ohio Department of Education, 1981; Stedman, 1985). 

While the contribution of the effective schools movement has been substantive and such 
research will obviously be continued, there are a number of related questions, concerns and 
criticisms emerging in educational discourse. Specifically, concerns posed by a number of 
investigators include that the development of new techniques for evaluating school effectiveness
has not kept pace with the increased and continuing interest (Felter, 1989; Webster and Olson,
1988); that most studies that have been done have focused upon narrow educational outcomes,
typically norm-referenced achievement tests (Rowan, Bossert, and Dwyer, 1983; Stedman, 1987,
1988); that the research has been primarily limited to elementary schools in urban systems with
large populations of disadvantaged youth (Clark, Lotto, and McCarthy, 1980; Farrar, Neufeld,
and Miles, 1983; Firestone and Herriott, 1982; Rowan, Bossert, and Dwyer, 1983); and that most
of the research attempting to associate school effects with student learning is correlational,
meaning that causation has not generally been established (D'Amico, 1982; Neufeld, Farrar and
Miles, 1983; Purkey and Smith, 1983). If research related to effective schools is to be advanced, 
new techniques for identifying and evaluating effective schools need to be developed (Saka, 
1989). Inherent in this task are two complex issues: (1) a better definition of school 
effectiveness, and (2) the development of a model within which it can be assessed.

Improved Definition 

In attempts to provide a better definition of effectiveness and responding to the narrowly 
focused concern of earlier effective schools research, Murnane (1987), David (1987), and others 
have been proponents for developing an expanded number of outcome indicators. In addition, 
Oakes (1989), David (1987), and Cohen (1986) have argued the importance of incorporating 
input and process/context indicators as important aspects of better accountability mechanisms. 
Possible input indicators often include school enrollment, socioeconomic/ethnic composition, 
proportion of limited English speaking children, enrollments in categorical programs, staff 
characteristics, and financial resources. Process indicators describe what is being taught, the way 
it is being taught, and include consensus on school goals, instructional leadership, opportunity to 
learn, school climate, staff development, and collegial interaction among teachers. Outcome 
indicators are usually related to capturing the results of school on students or providing 
information about other definitions of "good schooling," and may include student academic 
performance, teacher and student attendance rates, dropout and completion rates, performance of 
students at the next level of schooling, parent and student satisfaction, percent completing 
advanced courses, college attendance, and individual school goals (David, 1987; Oakes, 1989; 
Olson and Webster, 1990; Pollard, 1987; Shavelson, et al., 1987). 

It is essential to have widespread input from all constituencies when determining the 
attributes of good schooling. In Dallas, a 27 member task force, the Accountability Task Force, 
is charged with the responsibility of overseeing the accountability system. The membership 
includes four elementary teachers, three middle school teachers, four high school teachers, four 
principals, four parents, five members of the business community, and three central office 
administrators. In addition, the various employee organizations each have an ex officio member 
on the task force. This task force deals with many aspects of the accountability system including 
methodology, testing, determining and weighting important objectives of schooling, and 
determining the rules for financial awards that are related to the accountability system. The 
Accountability Task Force also hears any concerns or grievances relative to the accountability 
system. 
Improved Assessment Models

Unfortunately, at this point in time, State Departments of Education are the major actors 
in defining effective schools through the publication of various outcome variables that lend 
themselves to public comparison among schools and school districts. With the exception of 
some work done by the State Department of Education in South Carolina (May, 1990), most 
State accountability systems compare schools and school districts based on unadjusted outcome 
measures (Guskey and Kifer, 1990). This technique favors schools that serve advantaged 
students and usually adversely affects schools with population demographics that differ from the 
norm. A recent article by Jaeger (1992) graphically illustrated this point relative to ethnic 
background and SAT scores. The non-statistical technique of comparing schools with similar 
characteristics, also employed by a number of State Education Agencies, is one solution for cases
involving a limited number of grouping characteristics (Nicoll, 1989; Felter, 1989). However, 
this approach assumes that grouping characteristics are relevant to the outcomes being reported 
and has serious limitations when there is consistent one-directional variance on the grouping 
characteristics within group (Webster and Edwards, 1993). 

A more appropriate approach to comparing schools and school districts for accountability 
purposes, and thereby defining effective schools, involves the use of one or more common 
statistical techniques to adjust outcomes based on relevant inputs. With these techniques, 
exceptional student performance is defined as measured performance that is significantly above 
or below that which would be expected if the school did no more, or no less, than maintain its 
students' rates of growth on the specified outcome variables or, alternatively, departed markedly 
from districtwide patterns. When a school's population of students departs markedly from its 
own pre-established trend, or from the more general pattern of similar students throughout the 
district, this departure can be attributed to school effect. School effects can therefore be 
identified by establishing expected outcome levels on the basis of student history or on the basis 
of the patterns of like students, then comparing actual student outcome levels with those 
empirically determined expectations. The most effective schools would be those that had the 
aggregate of students exceeding prediction the most. 

It is important to note that the goal of these techniques is to reliably and validly identify 
effective and ineffective schools. Causal relationships, particularly as they relate to process 
variables, can be thoroughly investigated once effective and ineffective schools have been 
identified. Schalock, Cowart and Staebler (1993), in discussing teacher productivity, state that a 
valid definition of teacher productivity must be predicated on productivity being the contribution 
to learning that a teacher makes rather than the absolute level of student learning. Productivity is 
a relative term, not an absolute one. The same can be said for school productivity. Therefore, it
is very possible for School A to be more productive than School B even though the absolute 
levels of pupil outcomes for School B are greater than those for School A. Absolute outcomes 
are a function of many inputs outside the influence of the school including student background 
characteristics, assistance and support for learning in the home and community, and student 
knowledge and skill. In identifying effective schools, the influence of known sources of student 
and school variation that are outside the control of the school must be eliminated to isolate 
school effect. 

A number of statistical models can be applied to this problem. One method of 
incorporating a large number of input, process, and outcome variables into an equation is 
multiple regression analysis (Aiken and West, 1991; Bano, 1985; Felter and Carlson, 1985; Kirst,
1986; Klitgaard and Hall, 1973; McKenzie, 1983; Saka, 1989; Webster and Olson, 1988). In
essence, the average deviation from the predicted values of the dependent variables is
determined for each school. Schools would then be ranked on the deviation. As a simple 
illustration, the mean score for an outcome measure such as attendance is predicted after 
considering such input variables as previous attendance, gender, ethnicity, socioeconomic status, 
and previous achievement level. The difference between predicted and actual attendance, a 
residual or adjusted score, can then be interpreted in a comparison with other school residuals as 
the school's own effect on attendance. Some advantages of multiple regression analysis over 
other statistical techniques for this application include its relative simplicity of application and 
interpretation, its robustness, and the fact that general methods of structuring complex regression 
equations to include combinations of categorical and continuous variables and their interactions 
are quite straightforward (Aiken and West, 1991; Cohen, 1968; Cohen and Cohen, 1975; 
Darlington, 1990). 
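
The residual logic described above can be sketched in a few lines. This is a minimal illustration, not the District's actual procedure: the function names (`ols_fit`, `school_residual_means`), the two-predictor layout, and the toy data are all assumptions made for the example. It fits one districtwide equation at the student level, then averages each school's student residuals.

```python
from collections import defaultdict

def ols_fit(X, y):
    """Return least-squares coefficients [b0, b1, ...] via the normal equations."""
    rows = [[1.0] + [float(v) for v in x] for x in X]   # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda idx: abs(A[idx][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def school_residual_means(schools, X, y):
    """Fit one districtwide equation, then average residuals within each school."""
    b = ols_fit(X, y)
    by_school = defaultdict(list)
    for school, x, actual in zip(schools, X, y):
        predicted = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
        by_school[school].append(actual - predicted)    # actual minus predicted
    return {s: sum(v) / len(v) for s, v in by_school.items()}

# Hypothetical data: two predictors per student, two schools.
schools = ["A", "B"] * 4
X = [(0, 0), (0, 0), (1, 0), (1, 0), (0, 1), (0, 1), (1, 1), (1, 1)]
y = [1 + 2 * x1 + 0.5 * x2 + (1 if s == "A" else -1)
     for s, (x1, x2) in zip(schools, X)]
means = school_residual_means(schools, X, y)
```

Schools would then be ranked on these mean residuals. In the toy data the two schools have identical intakes, so the whole difference in outcomes shows up in the residuals.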

Another method of incorporating a large number of input, process, and outcome variables 
into an equation is canonical correlation (Van de Geer, 1971). The canonical correlation model
calls for the use of canonical correlation to establish a linear combination of dependent and 
independent variables from which average deviation levels can be established for each school. In 
essence, the average deviation from the first canonical variate or first several variates for the 
dependent variables would be determined for each school. Schools would then be ranked on this 
deviation to produce the school effectiveness indices. In essence, for multilevel and
student-level variables, each level of each variable would be standardized and the average value of the
standardized levels computed. Average standardized scores for each level would be combined to 
produce a total standard score for the variable. Then, all dependent and independent variables 
would be used to establish the first canonical correlation for the data set. The values of the first 
canonical variate for the linear combination of dependent variables would be determined for each
school. Deviations from the district dependent variable portion of the first canonical variate 
would then be computed for each school. The deviations from the canonical variate would be 
used to rank schools for the school effectiveness indices. As in the other models, the deviations 
would be standardized. 

A third appropriate statistical method would be time-series analysis (Fuller, 1976; 
Nelson, 1973). Time series analysis calls for the use of equations to establish predicted scores 
for each dependent variable that has sufficient historical levels of data available. The average 
deviation from the predicted levels of the dependent variables would be determined for each 
school and schools would be ranked on this deviation. For multilevel and student-level 
variables, each level of each variable would be standardized and the average value of the 
standardized levels computed. Average standardized scores for each level would be combined to 
produce a total standard score for the variable. Then, a predicted value for each dependent 
variable would be computed. Deviations from the predicted values would then be computed for 
each school. The deviations would be averaged and the average deviation used to rank schools 
for the school effectiveness indices. Once again, the deviations would be standardized. 

Finally, hierarchical linear modeling was considered. Hierarchical linear modeling is an 
application of multiple regression analysis that provides different equations at different levels of 
observation. Education data are often hierarchical. Students are grouped into classes that are 
grouped into schools that are grouped into districts, etc. Hierarchical linear modeling takes this 
hierarchical structure into account and thus makes it possible to incorporate variables from all 
levels. Thus equations might be developed at the class, school, and district levels (Bryk and 
Raudenbush, 1992). 

In theory building, one must be concerned with the basic assumptions inherent in 
traditional linear model analysis, those being linearity (although regression equations can be 
adjusted to reflect non-linear relationships), normality, homoscedasticity, and independence. In 
applications where the entire population is involved and one is using multiple regression analysis 
as a descriptive technique, the requirements of independence of individual observations as well 
as of normality have no practical impact. In fact, individual observations are being used to 
determine school effect so that non-independence of student observations within classroom and 
school is what is being sought. Linearity and homoscedasticity are still important statistical 
issues and must be dealt with. 

The Accountability Task Force, given these various options, chose multiple regression 
analysis. Canonical correlation was not chosen because the Task Force wanted to be able to 
differentially weight variables by perceived level of importance. Time series analysis was not 
the technique of choice because it required at least three years of longitudinal data and, in an
urban school district, population mobility is such that these requirements would have had major 
impact on the degrees of freedom of the equations. Hierarchical linear modeling was not chosen 
because of degrees of freedom problems in the within school equations and the fact that the 
authors were not concerned with explaining school effects, merely with validly identifying 
effective schools. For this reason, path analysis, another explanatory technique, was not 
considered. 

This paper discusses several distinct multiple regression models and the issues and 
problems related to their application to the practical problem of identifying effective schools. 
The results obtained from the application of each of these models are discussed. The first model 
was implemented in 1984, the second in 1992. 



METHOD 

Effective Schools Methodology - 1984 

When faced with the challenge of identifying effective schools in 1984, it was decided to 
identify effective schools in terms of student achievement in the basic skill areas of reading, 
mathematics, and language usage (Webster and Olson, 1988). While many other desirable goals 
and outcomes of public education could have been easily enumerated, at that time heightened 
public awareness was focused on standardized test scores. A school was not considered 
successful unless its student test scores improved. 

School Effects. In these early studies a school's effectiveness was associated with
exceptional student achievement, defined as measured test performance above or below that
which would be expected if a school did no more or no less than simply maintain students'
previous rates of achievement growth. In general it is reasonable to expect students to continue 
to achieve at a given rate. When a school's population of students departed markedly from its 
own pre-established trend or from the more general trend of similarly achieving students 
throughout the district, this departure was attributed to a school effect. The problem of 
measuring a school's effect, then, becomes one of establishing the school's students' rates of
achievement, setting expected levels of performance based upon these rates, and determining the
extent to which its students, on the average, exceeded or fell short of expectation. Essentially,
this is the same problem as establishing a school's average level of achievement after controlling
for its students' previous levels of achievement. The procedures involved regression analysis to 
compute prediction equations by grade level and skill area independent of school identification 
and then using these equations within schools to obtain mean gains over expectations. The
individual student was the unit of analysis. 

Advantages of this Approach. There are a number of advantages that can be enumerated
for this approach. First, it controlled for systematic influences governing the student 
composition of schools. Since, as many studies have shown, test score performance is highly 
correlated with student background, demographic, and environmental factors, controlling on 
previous test score performance indirectly controls for these other variables as well. Second, it 
provided all schools an equal opportunity to demonstrate success. Schools need only focus upon 
accelerating their own students' rates of achievement; they need not compete with each other in 
terms of absolute achievement levels. Thus, schools derived no particular advantage by being 
composed of higher- or lower-ability students. Third, since all schools are allocated resources 
on the basis of specific formulae (schools having similar needs are provided similar resources) 
the procedure was sensitive to differences in the way resources were managed. Finally, the 
approach was in consonance with many practitioners' views of what constituted an effective 
school. 

Limitations. The major limitation of this approach was that it focused on standardized
test performance. Certainly, learning and achievement occur in areas not measured by these 
tests, and in areas where the tests are less sensitive. While the equations developed through this 
approach proved to be very efficient and had a great deal of face validity, limiting the outcomes 
to standardized test results did not present a complete picture of the legitimate products of 
schooling. Standardized tests have been faulted in the literature for being insensitive to 
curricular content, focusing too much on recall of information (Frederickson, 1984), being biased
in favor of more advantaged students (Guskey and Kifer, 1990), and for not measuring "real"
achievement (Archibald and Newman, 1988). Kreft (1987) has argued that, since standardized
tests are designed to distinguish between students, and not between contexts, new tools are
needed to yield more valid assessments of school outcomes. Thus, a design for effectiveness
indices should contain an array of outcome measures in addition to standardized test results.
Nevertheless, it must be realized that much of the recent public attention directed at the status of 
achievement in the nation's schools is really directed at the results of standardized testing 
programs. In the current educational environment, no matter what schools do to affect other 
outcomes, their efforts will be recognized largely to the extent that they affect standardized test 
scores. This was more true in 1984 than it is today. 

A second limitation of this approach is not really a limitation, but rather a misperception 
of the method. The equations used three years of student achievement history in predicting 
student outcomes. The individual student growth curves carried with them important 
information about student background variables. It was demonstrated empirically that there was
consistently no correlation between background variables such as ethnicity, free or reduced 
lunch, and gender, and school rankings by the equations. Nonetheless, the concern among 
practitioners that these variables were not accounted for in the equations continued, in spite of 
evidence to the contrary. 

Finally, the issue of data aggregation is important. In the early school effectiveness 
studies, a single effectiveness index was derived. This necessarily involved several levels of data 
aggregation. Thus, within schools student test scores were aggregated by subtest (reading, math, 
or language) within grade level to form component subject area school effectiveness indices. 
These were then aggregated across subtests to form grade level school effectiveness indices. 
Finally, the grade level school effectiveness indices were aggregated to form a single, 
schoolwide school effectiveness index. In examining intermediate results it was apparent that 
these steps often masked important effects among the components comprising the highest level 
of aggregation. For instance, in several instances a school's high (or low) rank could be traced 
to a particularly outstanding (positive or negative) effect at a single grade level in one subject 
area. Similar findings have been reported elsewhere (Abalos, Jolley, and Johnson, 1985; 
Helmstadter and Walton, 1985; Mandeville, 1988; Mandeville and Anderson, 1987) and cast
doubt on the feasibility of aggregating school effectiveness indices over grade levels and subject 
areas without conscious thought as to the relative importance of each outcome variable. 

Methodology. The first phase of this study involved computing 36 prediction equations,
one for each possible combination of tests, subtests, and grade levels. In each equation the
criterion was the spring 1983 subtest score and the predictors (one at grade 2; two elsewhere) were
the previous years' subtest scores. The accuracy of prediction was assessed by the standard
errors of estimate, the indices of forecasting efficiency, and the multiple coefficients of
determination ($R^2$). $R^2$ values varied from a low of .28 for second grade mathematics to a high of .87
for eighth grade language. Most $R^2$ values were .70 and above. Prediction generally improved with
increasing grade level suggesting that, at higher grade levels, schools have a lesser opportunity to
show a differential effect on student achievement as measured by standardized test scores. The
use of three years of historical data on each student did not significantly improve prediction but
did significantly reduce degrees of freedom in the equations.

The purpose of the next phase of the study, prediction and estimation, was to compute an 
aggregate measure of each school's actual performance with respect to its expected performance, 
the aggregation to be taken over students, grade levels, and tests. This phase involved several 
steps. 
First, expected scores and differences between expected scores and actual scores for
students within schools were computed. This was done separately by subtest within grade level 
and then aggregated over subtests and grade levels within schools. 
For computing the individual difference scores, let: 

$Y_{sgi}$ be the spring 1983 grade equivalent score for individual $i$ at
grade level $g$ on subtest $s$, and

$X_{1sgi}$, $X_{2sgi}$ be the spring 1981 and spring 1982 subtest scores for the
same individual.

Then the expected 1983 score on $s$ for each individual was given by the linear
function:

$$\hat{Y}_{sgi} = f_t(X_{1sgi}, X_{2sgi})$$

where the function $f_t$ was the prediction equation
corresponding to the appropriate grade level, subtest, and battery ($t$) classification. The equations
appeared as follows:

$$\hat{Y}_{sgin} = b_0 + b_1 X_{1sgi,n-1} + b_2 X_{2sgi,n-2}$$

Where:

$\hat{Y}_{sgin}$ = predicted outcome variable in year $n$ for individual $i$ at grade
level $g$ on subtest $s$.

$b_0$ = the constant
$b_1$ = the beta weight for year $n-1$
$b_2$ = the beta weight for year $n-2$

Individual difference scores were then computed as

$$d_{sgi} = Y_{sgi} - \hat{Y}_{sgi}$$

The $d_{sgi}$ eventually were to be aggregated over subtests and grade levels within schools.
However, within a group to be aggregated, both the reliability and the scale of the individual $d_{sgi}$
varied depending upon the particular function used to compute $\hat{Y}_{sgi}$ as well as the individual's
location in the domain of predictors. Expectations at higher grade levels were generally more 
reliable than expectations at lower grade levels and the expectations for individuals close to the 
centroid of the predictors were more reliable than those for individuals further away. 

To correct for differences in scale and reliability, the individual $d_{sgi}$ were standardized by
dividing by their individual standard errors of prediction to yield individually standardized
scores.
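
The source does not give the formula for the individual standard error of prediction. Assuming it is the textbook ordinary least squares quantity, the standardization can be written as follows, where $z_{sgi}$, $SE_{sgi}$, $s_{sg}$, $\mathbf{x}_i$, and $\mathbf{X}$ are notation introduced here for illustration:

```latex
\[
z_{sgi} = \frac{d_{sgi}}{SE_{sgi}}, \qquad
SE_{sgi} = s_{sg}\,\sqrt{1 + \mathbf{x}_i^{\top}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{x}_i}
\]
```

Here $s_{sg}$ is the standard error of estimate of the equation for subtest $s$ at grade $g$, $\mathbf{X}$ is the matrix of predictors with a leading column of ones, and $\mathbf{x}_i$ is the corresponding row for individual $i$. Under this assumption the quadratic form grows as an individual moves away from the centroid of the predictors, which is consistent with the statement that expectations near the centroid are more reliable.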

After the $d_{sgi}$ were computed on each subtest for each individual in every school, they were
then aggregated across subtests and grade levels within each school. A lower bound statistic was
then computed by subtracting one standard error of the mean from the mean individually
standardized residual score. This lower bound statistic, LBd, was then used to rank the schools.
The lower bound statistic allowed probability statements to be made regarding the placement of
individual schools within the ranked distributions. Ranking was done separately within K-3, 4-6,
K-6, 7-8 and 9-12 grade configurations. 
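
The lower bound ranking can be illustrated with a short sketch. It assumes the standard error of the mean is the sample standard deviation divided by the square root of the number of students, which the text does not spell out. The function names and the two example schools are hypothetical.

```python
import math
import statistics

def lower_bound(z_scores):
    """Mean standardized residual minus one standard error of the mean."""
    se_mean = statistics.stdev(z_scores) / math.sqrt(len(z_scores))
    return statistics.mean(z_scores) - se_mean

def rank_schools(z_by_school):
    """Return school names ordered from highest to lowest lower bound."""
    return sorted(z_by_school,
                  key=lambda name: lower_bound(z_by_school[name]),
                  reverse=True)

# Hypothetical standardized residuals for two illustrative schools.
example = {"steady": [0.5, 0.5, 0.5, 0.5], "volatile": [-1.0, 3.0]}
ranking = rank_schools(example)
```

In this toy case the "volatile" school has the higher mean residual but the lower rank, because its mean rests on few and widely scattered students. Ranking on the lower bound rather than the raw mean penalizes that kind of uncertainty.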

In these early studies, the authors did not directly address the classic assumptions of the 
linear model (linearity, normality, homoscedasticity, independence). Instead, the equations were 
empirically validated through a series of studies that examined the extent of bias in the results 
toward schools with differing student characteristics. Correlations between school rank and 
student enrollment, percent white students, percent Black students, percent Hispanic students, 
and percent students on free or reduced lunch programs were all non-significant. In addition, the 
ranking statistic was uncorrelated with schools' mean achievement for the previous year, a 
crucial result for establishing the fairness of the procedures. 

These procedures were employed during the 1984-85 school year by the Dallas schools in 
a program to recognize and reward outstanding schools. Under this program, teachers in schools 
ranked in the top quarter of each grade-level category of schools each received a stipend of up to 
$1500. Other employees in those schools also received stipends, the amount determined by 
position and responsibility. 

Effective Schools Methodology - 1992

School Effects. The basic rationale of the 1992 study is similar to that used in earlier 
studies. The school effectiveness methodology defined a school's effectiveness as being 
associated with exceptional measured performance above or below that which would be expected 
across the entire District. When a school's population of students departs markedly from the 
more general trend of similar students throughout the District, this departure is attributed to 
school effect. The problem of measuring a school's effect, then, becomes one of establishing the 
student levels of accomplishment on the various important outcome variables, setting levels of 
performance based on these expectations, and determining the extent to which its students, on the 
average, exceed or fall short of expectation. The procedures involve regression analysis to 
compute prediction equations by grade level or by school for each outcome variable independent 
of school identification and then using these equations within schools to obtain mean gains over 
expectations. A major feature of this approach also involves assigning relative weights to each 
of the outcomes. Once weighted levels of performance have been determined, the methodology 
provides an indicator of how well a school performs relative to other schools throughout the
District. Once again, as in the earlier studies, the student is the unit of analysis. 

The difference between the logic of this approach and the one used in 1984 is subtle but 
substantive. In 1984 individual growth curves were developed for each student based on two 
years of historical test data. Equations were established at the student level with districtwide 
data, then the individual student residuals were applied to the students' home schools and 
aggregated. In the 1992 studies, as many predictor variables as was efficient were used to predict 
individual student achievement and other outcomes. Equations were again established at the 
student level with districtwide data, then the individual student residuals were applied to the 
students' home schools and aggregated. While individual student growth curves were used in 
1984, the 1992 studies computed relationships at a systemwide level and then applied them to the 
schools. In 1984, an effective school was defined as a school that had more than half of its 
students exceeding prediction based on their individual growth curves. In 1992, an effective 
school was defined as a school that had more than half of its students exceeding prediction based 
on the patterns of other like students throughout the District. 
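
The "more than half exceeding prediction" definition can be made concrete with a minimal sketch. It assumes a student exceeds prediction when the residual (actual minus predicted) is strictly positive. The function names are hypothetical, and the operational rankings described elsewhere in the paper rest on aggregated residuals rather than on this simple count.

```python
def share_exceeding(residuals):
    """Fraction of students whose actual outcome is above the predicted one."""
    return sum(1 for r in residuals if r > 0) / len(residuals)

def is_effective(residuals):
    """True when more than half of a school's students exceed prediction."""
    return share_exceeding(residuals) > 0.5

# Hypothetical residuals for one school: three of four students exceed prediction.
example_share = share_exceeding([1.0, -1.0, 2.0, 3.0])
```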

There are a number of differences between the equations developed for the 1984 studies 
and the ones developed for 1992. First, and probably most important, the number and nature of 
outcomes were greatly expanded in 1992. While the 1984 studies used only standardized 
achievement tests, the 1992 studies added 143 separate course related criterion-referenced tests, 
student promotion and graduation rates, student attendance rates, and percentage of students 
taking and average scores on the Scholastic Aptitude Tests (SAT). 

Changes In The Model. In addition to the expanded number of outcome variables used in 
the equations, a number of other changes were made in the model. First, an attempt was made to 
initially meet the assumptions of the linear model rather than to empirically validate the 
equations after the fact. The major assumptions that had to be met in order for the equations to 
produce adequate prediction were linearity and homoscedasticity. (Normality and independence 
of observations are only important if one is attempting to infer attributes from a sample to a 
population). Nonlinearity would be unacceptable because the equations would not fit the data. 
Non-homoscedasticity would be unacceptable because a student's prior position in the predictor 
distribution would bias the range of the residuals depending on the standard deviation of the 
residuals at a given point in the distribution. Linearity was routinely checked with each equation 
and achieved. Homoscedasticity was achieved by dividing each distribution into 128 arrays and 
normalizing each array around the regression line. This technique prevented similar scores from 
different points in the distribution from carrying more weight than like scores at other points in 
the distribution. In addition, the regression lines representing all combinations of background 
variables were graphed to study patterns of bias in prediction at different points in the 
distribution. No consistent patterns were found. 

The problem of perceived bias toward schools serving specific populations was addressed 
by using a two-stage process. In the first stage, each predictor and outcome variable was 
regressed on the set of important background variables and their first order interactions. 
Important background variables were ethnicity, gender, limited English proficiency status, and 
free or reduced lunch status. Residuals from these regressions then became the predictor and 
criterion variables for the next level of prediction. This demonstrably addressed practitioners' 
concerns about the impact of background variables on outcomes. 

Once the initial residuals were obtained, a stepwise regression approach was used. As 
many predictors as was necessary to attain adequate prediction were included in the equations. 
As a result of using residuals that accounted for the student background information and of 
expanding the number of predictor variables, satisfactory prediction was attained without having 
to go back more than one year. (Most multiple r's were above .620 and forty percent exceeded 
.700.) This maintained the degrees of freedom associated with the equations. 

Effectiveness indices were produced for each outcome variable for each grade level by 
each of the combinations of background variables (gender by ethnicity, gender by LEP status, 
gender by economic status, etc.). All indices were standardized to a mean of 50 and a standard 
deviation of 10. This made them easily interpretable and provided a powerful diagnostic tool to 
determine patterns of service to various student groups. 

Two additional aspects of the regression model were examined: the correlation of 
residuals with predicted values and the equality of outcome and predictor residual scores by the 
fairness variables. Correlations of residuals and predicted values would be unacceptable because 
they would imply that a student's prior position in the predictor distribution would bias the student's 
effectiveness score, the residual. Unequal residuals for any fairness variable group would 
suggest biased results within that group. Any of these problems would bias the effectiveness 
index for a school with a preponderance of such students from a given location in the 
distribution, a situation that the entire process was created to remedy. 

Correlations of standardized residuals with predicted values were all near enough to zero 
to be acceptable. The largest was -.042. Again, statistical significance was not examined in 
these tests of aspects of the model because of the large numbers of degrees of freedom. Whether 
or not a result was statistically significant was irrelevant. Practical effect on the results was the 
only relevant criterion. The equality of residual means by fairness variable grouping was 
examined for each predictor and outcome variable. Examination of the means by fairness 
variable before and after the first regression stage showed that large pre-existing differences in 
group means were equalized by the procedures described earlier. Only small, non-practical 
differences between group means existed after the first regression stage. 
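
The two checks described above can be illustrated with a minimal sketch. It assumes the residuals, predicted values, and group labels are already available as plain lists; the function names and the toy data are hypothetical and are not taken from the District's programs.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def group_means(values, groups):
    """Mean of the values within each group label."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

# Hypothetical toy data (not District data).
predicted = [-1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
residuals = [0.2, -0.3, 0.1, 0.1, -0.2, 0.1]
groups = ["FRL", "FRL", "FRL", "non-FRL", "non-FRL", "non-FRL"]

# Check 1: correlation of residuals with predicted values.
r = pearson_r(predicted, residuals)

# Check 2: spread of mean residuals across fairness-variable groups.
means = group_means(residuals, groups)
largest_gap = max(means.values()) - min(means.values())
```

In keeping with the text, the quantities of interest are the size of `r` and of `largest_gap`, judged for practical rather than statistical significance.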

Methodology. In the first regression and prediction phase, each predictor and outcome 
variable was regressed on a combined ethnicity/language proficiency variable, gender, and 
free/reduced lunch status and the first order interactions of these variables. 

For computing the individual difference scores (residuals) in the first phase, let 

$Y_{mgi}$ = the outcome variable of interest for each individual i at grade level g 
on measure m 

$X_{1mgi}$ = Black status (1 if Black, 0 otherwise) 

$X_{2mgi}$ = Hispanic English Proficient status (1 if English Proficient Hispanic, 0 
otherwise) 

$X_{3mgi}$ = Hispanic LEP status (1 if Limited English Proficient Hispanic, 0 
otherwise) 

$X_{4mgi}$ = gender status (1 if male, 0 otherwise) 

$X_{5mgi}$ = free or reduced lunch status (1 if subsidized lunch, 0 otherwise) 

Then the expected 1992 level on each measure was given by the linear function 

$$\begin{aligned}
\hat{Y}_{mgi} = f_t(&X_{1mgi}, X_{2mgi}, X_{3mgi}, X_{4mgi}, X_{5mgi}, X_{1mgi}X_{4mgi}, 
X_{2mgi}X_{4mgi}, X_{3mgi}X_{4mgi}, \\
&X_{1mgi}X_{5mgi}, X_{2mgi}X_{5mgi}, X_{3mgi}X_{5mgi}, X_{4mgi}X_{5mgi})
\end{aligned}$$

where the function $f_t$ was the prediction equation corresponding to the appropriate grade 
level, measure, and outcome (t) classification. The equations appeared as follows: 

$$\begin{aligned}
\hat{Y}_{mgi} ={}& b_0 + b_1 X_{1mgi} + b_2 X_{2mgi} + b_3 X_{3mgi} + b_4 X_{4mgi} + b_5 X_{5mgi} \\
&+ b_6 X_{1mgi}X_{4mgi} + b_7 X_{2mgi}X_{4mgi} + b_8 X_{3mgi}X_{4mgi} + b_9 X_{1mgi}X_{5mgi} \\
&+ b_{10} X_{2mgi}X_{5mgi} + b_{11} X_{3mgi}X_{5mgi} + b_{12} X_{4mgi}X_{5mgi}
\end{aligned}$$

Where: 

$\hat{Y}_{mgi}$ = Predicted outcome or predictor variable 

$b_0$ = the constant 

$b_1$ = Black student status 

$b_2$ = Hispanic English proficient status (HEP) 

$b_3$ = Hispanic Limited-English proficient status (HLEP) 

$b_4$ = gender status 

$b_5$ = free/reduced lunch status (FRL) 

$b_6$ = Black/gender interaction status 

$b_7$ = HEP/gender interaction status 

$b_8$ = HLEP/gender interaction status 

$b_9$ = Black/FRL interaction status 

$b_{10}$ = HEP/FRL interaction status 

$b_{11}$ = HLEP/FRL interaction status 

$b_{12}$ = gender/FRL interaction status 

It should be noted that since a stepwise regression approach was used, the status variables 
were entered in different orders depending on the relationships between and among variables for 
each outcome and each predictor variable at each grade level. Not all background variables and 
interactions were significant in every case. It should be further noted that other students 
(non-Black, non-HLEP, non-HEP) were not explicitly included in the first-stage equations since 
they formed the reference group against which the other ethnic/language groups were compared; 
omitting them avoids singularity of the regression design matrix. 

From these equations, a predicted score, $\hat{Y}_{mgi}$, was computed for every student on each 
outcome variable at each grade. Also, predicted scores $\hat{X}_{1mgi}, \hat{X}_{2mgi}, \ldots, \hat{X}_{nmgi}$ were 
computed for the n predictor variables for each student and grade. 

Individual difference scores were then computed as 

$d_{mgi} = Y_{mgi} - \hat{Y}_{mgi}$ for criterion scores and 

$X_{mgi} - \hat{X}_{mgi}$ for predictor scores. 
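
As an illustration only, the first-stage computation can be sketched as an ordinary least-squares fit on the five status indicators and their seven first-order interactions, followed by subtraction of the predicted score. The sketch assumes all terms are entered at once (the study used stepwise entry), uses a small hand-written solver so that it needs only the standard library, and runs on hypothetical toy data; none of the function names come from the original programs.

```python
from itertools import product

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols_fit(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def design_row(black, hep, hlep, male, frl):
    """Constant, five status indicators, and seven first-order interactions."""
    return [1.0, black, hep, hlep, male, frl,
            black * male, hep * male, hlep * male,
            black * frl, hep * frl, hlep * frl, male * frl]

def first_stage_residuals(students, scores):
    """Observed score minus the score predicted from background variables."""
    rows = [design_row(*s) for s in students]
    b = ols_fit(rows, scores)
    predicted = [sum(bi * xi for bi, xi in zip(b, r)) for r in rows]
    return [y - p for y, p in zip(scores, predicted)]

# Hypothetical toy data: every ethnicity/language x gender x lunch cell, 3 students each.
ethnic_codes = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]  # other, Black, HEP, HLEP
students, scores = [], []
for i, ((black, hep, hlep), male, frl) in enumerate(
        list(product(ethnic_codes, (0, 1), (0, 1))) * 3):
    students.append((black, hep, hlep, male, frl))
    noise = (i * 7) % 5 - 2  # deterministic stand-in for unexplained variation
    scores.append(50 - 5 * black - 4 * hep - 8 * hlep - 2 * male - 6 * frl + noise)

residuals = first_stage_residuals(students, scores)
```

Because each status indicator is a column of the design matrix, the residuals sum to zero within each indicated group, which is the property the first stage relies on to remove group differences.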

The multiple correlations between fairness variables and student-level outcome and 
predictor variables ranged from a low of .070 for student attendance at grade 12 to a high of .488 
for NAPT Reading at grade 10. Coefficients were generally above .425 for ITBS/NAPT Reading, 
Vocabulary and Language tests and above .35 for the Mathematics test. Number of students 
ranged from a low of 3,705 for the correlation with the criterion-referenced tests at grade 12 to a 
high of 10,452 with student attendance at grade 1. 

In the second phase of the study, the outcome residuals ($d_{mgi}$) computed during Phase 1 
were regressed on the residualized predictor variables. Thus residuals of the predictor variables 
were used to predict residuals of the outcome variables. Once again, a stepwise regression 
approach was used. In all cases, the prior level of the outcome variable was the most significant 
predictor of the outcome variable. The equations appeared as follows: 

$$\hat{Y}_{rmgi} = b_0 + b_1 X_{r1mgi} + \cdots + b_n X_{rnmgi}$$

Where: 

$\hat{Y}_{rmgi}$ = Predicted outcome variables (residuals) for each individual i at 
grade level g on measure m. All $Y_r$'s and $X_r$'s in this phase were 
residuals computed in the First Regression and Prediction Phase. 

$b_0$ = constant 

$b_1$ = first predictor variable into the equation (residuals) 

$b_n$ = last predictor variable into the equation (residuals) 

In the prediction of residual outcome variables from residual predictor variables, multiple 
R's ranged from .367 for grade 1 language to .818 for grade 8 language (with the exception of 
grade 1 attendance, where kindergarten attendance was not available as a predictor and the 
multiple R was .156). Most multiple R's were above .620 and over 40 percent exceeded .700. 
Degrees of freedom ranged from 3,596 for grade 11 language to 9,479 for grade 3 attendance. 
Most elementary degrees of freedom exceeded 7,000, most middle school exceeded 6,200, and 
most high school exceeded 4,000. 

The next step of the second stage was to compute residuals ($d_{rmgi}$) from the regression 
equations. Raw residuals were computed using the following equation: 

$$d_{rmgi} = Y_{rmgi} - \hat{Y}_{rmgi}$$

After raw residuals were computed, the predictor space was divided into 128 equal 
intervals. For each interval p, the mean and standard deviation of the raw residuals was 
computed. Each raw residual was then standardized by subtracting the mean for the interval and 
dividing by the standard deviation of the interval. Expressed as an equation, the standardized 
residuals were: 

$$d'_{prmgi} = \frac{d_{prmgi} - \bar{d}_{prmg}}{sd_{prmg}}$$

Where: 

$d'_{prmgi}$ = the standardized residual for individual i in interval p at grade 
g on measure m 

$d_{prmgi}$ = the raw residual for individual i in interval p at grade g on measure m 

$\bar{d}_{prmg}$ = the mean residual in interval p at grade g on measure m 

$sd_{prmg}$ = the standard deviation of the residuals in interval p at grade g 
on measure m 
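
The interval standardization can be sketched as follows. This is a minimal illustration under stated assumptions: the "predictor space" is taken to be the range of predicted values, the population standard deviation is used, and an interval whose residuals have zero spread is assigned a standardized residual of 0. None of these choices is specified in the text, and the function name and toy data are hypothetical.

```python
from statistics import mean, pstdev

def standardize_within_intervals(predicted, residuals, n_intervals=128):
    """Standardize each raw residual by the mean and SD of its interval."""
    lo, hi = min(predicted), max(predicted)
    width = (hi - lo) / n_intervals or 1.0  # guard against a zero-width range
    # Assign each case to one of n_intervals equal-width intervals.
    bins = [min(int((p - lo) / width), n_intervals - 1) for p in predicted]
    groups = {}
    for b, d in zip(bins, residuals):
        groups.setdefault(b, []).append(d)
    stats = {b: (mean(v), pstdev(v)) for b, v in groups.items()}
    out = []
    for b, d in zip(bins, residuals):
        m, sd = stats[b]
        out.append((d - m) / sd if sd > 0 else 0.0)
    return out

# Hypothetical toy data whose residual spread grows with the predicted value.
predicted = [float(i) for i in range(40)]
residuals = [((-1) ** i) * (1 + i / 10) + (i % 3) for i in range(40)]

# Four intervals are used here only so that each toy interval holds several cases.
z = standardize_within_intervals(predicted, residuals, n_intervals=4)
```

After this step every interval has mean 0 and standard deviation 1, so a residual of a given size carries the same weight wherever it falls in the distribution.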

The last part of the second phase was to determine the residuals for the school level 
variables. These were promotion rate at the elementary level and graduation rate and an SAT 
achievement/participation variable at the high school level. For this part of the regression 
analysis, the critical issue was degrees of freedom. Since there were a relatively large number of 
measures available and the school was the unit of analysis, the number of schools was relatively 
small and the models were easily overspecified if too many variables were entered. Further, it 
was impossible to tell where overspecification became a significant factor after the first variable 
was entered in the equation. Therefore, only one predictor variable was used for each school- 
level equation. School residuals were found by subtracting the predicted value of each school 
level variable from the actual value using simple regression equations. 

The school ranking phase began at this point. The student-level part of the process will 
be described first. To simplify the notation from this point forward the following substitution 
will be made: 

$r_{smgi}$ = the standardized residual for individual i at grade g in school s on 
measure m 

Mean residuals for each school on each measure at each grade were computed. In order 
to obtain rankings reflecting weighting of variables by sample size, the mean residuals were 
standardized by subtracting the district mean residual (approximately 0 after the standardization 
within intervals), dividing by the district standard deviation of residuals (approximately 1), and 
multiplying by the square root of the n for each school on measure m at grade g. Expressing this 
as an equation: 

$$M_{smg} = \frac{\bar{r}_{smg} - \bar{r}_{mg}}{sd_{mg}} \sqrt{n_{smg}}$$

Where: 

$M_{smg}$ = the mean deviation of the mean residual for school s at grade g 
on measure m (expressed in standard errors of the mean) 

$\bar{r}_{smg}$ = the mean standardized residual at grade g in school s on 
measure m 

$\bar{r}_{mg}$ = the district mean standardized residual at grade g on measure m 

$sd_{mg}$ = the district standard deviation of the standardized residuals at 
grade g on measure m 

$n_{smg}$ = the number of students at grade g in school s on measure m 

The mean deviations $M_{smg}$ were then ranked to produce the ranking of the schools on 
measure m at grade g. To combine mean deviations across measures to determine each school's 
effectiveness index, the mean and standard deviation of the $M_{smg}$ were computed for each m and 
g, and the distribution of mean deviations was standardized and expressed as a T-score. As an 
equation this is expressed: 

$$T_{smg} = \frac{M_{smg} - \bar{M}_{mg}}{sd_{Mmg}} \times 10 + 50$$

Where: 

$T_{smg}$ = the standardized mean deviation for school s at grade g on 
measure m (expressed as a T-score) 

$M_{smg}$ = the mean deviation at grade g in school s on measure m 

$\bar{M}_{mg}$ = the district mean of mean deviations at grade g on measure m 

$sd_{Mmg}$ = the district standard deviation of mean deviations at grade g on 
measure m 
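
A minimal sketch of these two steps, for one measure at one grade, is given below. It assumes the standardized student residuals are grouped by school in a dictionary and uses population standard deviations; the school labels, data, and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

def school_mean_deviations(residuals_by_school):
    """Mean residual per school, in standard errors of the mean."""
    all_r = [r for rs in residuals_by_school.values() for r in rs]
    district_mean, district_sd = mean(all_r), pstdev(all_r)
    return {s: (mean(rs) - district_mean) / district_sd * sqrt(len(rs))
            for s, rs in residuals_by_school.items()}

def t_scores(deviations):
    """Rescale school mean deviations to a mean of 50 and SD of 10."""
    values = list(deviations.values())
    m, sd = mean(values), pstdev(values)
    return {s: (d - m) / sd * 10 + 50 for s, d in deviations.items()}

# Hypothetical standardized residuals for three schools.
residuals_by_school = {
    "A": [0.5, 0.3, 0.4, 0.6],
    "B": [-0.2, 0.1, 0.0],
    "C": [-0.6, -0.5, -0.4, -0.3, -0.7],
}

deviations = school_mean_deviations(residuals_by_school)
t = t_scores(deviations)
ranking = sorted(t, key=t.get, reverse=True)
```

Multiplying by the square root of the school's n means that a given mean residual counts for more when it is based on more students.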

Recall that school-level variables were ranked on simple deviations from a one-predictor 
regression. The residuals from these single-variable regressions were expressed as T-scores using 
a process identical to the one described above. 

To obtain the school effectiveness index, these standardized mean deviations were 
multiplied by the weight for the measure at the given grade (or, for school-level variables for the 
given school), summed and divided by the sum of the weights to obtain the school index. This is 
expressed in the following equation: 

$$T_s = \frac{\sum_m \sum_g T_{smg} W_{mg} + \sum_S T_{Ss} W_S}{\sum_m \sum_g W_{mg} + \sum_S W_S}$$

Where: 

$T_s$ = the effectiveness index for school s 

$T_{smg}$ = the standardized mean deviation at grade g in school s on 
measure m 

$W_{mg}$ = the weight for measure m at grade g 

$T_{Ss}$ = the standardized mean deviation for school-wide variable S in 
school s 

$W_S$ = the weight for school level variable S 
This procedure produced an easily interpretable T statistic for each school. 
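
The weighted combination amounts to a weighted mean of T-scores. The short sketch below assumes each component is supplied as a (T-score, weight) pair; the weights shown are hypothetical and are not those set by the Accountability Task Force.

```python
def effectiveness_index(student_level, school_level):
    """Weighted mean of T-scores; each argument is a list of (t_score, weight)."""
    pairs = list(student_level) + list(school_level)
    total_weight = sum(w for _, w in pairs)
    return sum(t * w for t, w in pairs) / total_weight

# Hypothetical components for one school.
student_level = [(55.0, 2.0), (48.0, 1.0)]  # e.g., two measure-by-grade T-scores
school_level = [(60.0, 1.0)]                # e.g., one school-wide variable

index = effectiveness_index(student_level, school_level)  # (110 + 48 + 60) / 4
```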

Since accountability indices without information for diagnosis and improvement are of 
limited utility, the system of equations which generate the school effectiveness indices also 
generate a great deal of information designed to help the schools diagnose areas of weaknesses. 
Schools receive rankings, also expressed as T-scores, on each variable by grade as well as 
corollary output of the contributions of each student subgroup to the rankings by measure and 
grade. These outputs are used at the campus level to obtain information about campus 
effectiveness and needed areas of improvement. When used in conjunction with skills analyses, 
these statistics become powerful diagnostic tools. 

RESULTS 

Face Validity - Table 1 displays some of the demographic characteristics of the top 
twenty percent of schools, as defined by the 1992 methodology. The reader will note that 
effective schools, as defined by this methodology, come in all sizes and shapes. District statistics 
at the particular grade levels are also presented to provide the reader with a framework for 
interpretation of the information. 

At the K-6 level, the most effective schools tended to have smaller enrollments than the 
average enrollment of District elementary schools. Enrollments ranged from a low of 193 to a 
high of 860. Ethnicities ranged from a high of 99.7% Black, 90.4% Hispanic, and 64.5% White 
to a low of 3.5% Black, 0.3% Hispanic, and 0% White. Most deprivation indices were above the 
District average of 69, ranging as high as 92, while the percentage of limited English proficient 
students ranged from a high of 57.9% to 0. In short, whether or not a school was ranked among 
the most effective could not be predicted from the demographics of the students that it served. 

The last column in Table 1 (SR) depicts the school rank on the percent of students 
passing all subtests of the Texas Assessment of Academic Skills (TAAS), the most recently 
reported State test. The top twenty-seven K-6 schools in the District on the effectiveness indices 
had composite ranks between 3 and 107 when ranked based on absolute achievement levels. It 
should be noted that the six schools that ranked in the top fifteen in the District on the TAAS, 
when no known non-school sources of variation were accounted for, were at least 50% White 
and had no deprivation index above 40. 

At the 7-8 level, the most effective schools had enrollments varying from 367 to 888 and 
were from 7.0% to 30.3% White, 16.3% to 75.4% Black, and 15.7% to 63.2% Hispanic. 
Deprivation indices varied from a low of 30 to a high of 85 and percent limited English 
proficient varied from 0 to 31.2%. Ranks on the effectiveness indices and the TAAS were closer 
at this level than at K-6. Part of the reason for this is that District middle schools do not differ as 
much demographically as do District elementary schools. 

At the high school level, magnet schools dominated the rankings. Four of the top six 
schools were magnet schools. This finding was predictable since at this level magnet schools 
spend about twice as much per student as do comprehensive high schools. Notice, however, that 
the most effective high school in the District, a school that was 97.9% Black and had a deprivation 
index of 32, was only ranked 16th out of 28 high schools on the TAAS. 

The schools identified by this process fit the perceptions of most practitioners of what 
constitutes effective schools. Thus, the process provided results that had a great deal of face 
validity. 

Consistency of Results. To examine consistency of results across years, 1983-85 effectiveness 
data were used. Despite the fact that the old effectiveness indices were based entirely on 
individual student growth curves on standardized achievement tests, and it was hypothesized that 
top schools could not maintain their standing because each time they finished high their 
individual student growth curves were accelerated, the results were extremely consistent. 
Effective schools tended to remain effective while ineffective schools tended to remain 
ineffective. Visual scans of the results from the 1983-85 studies and the 1992 study, using two 
very different models, suggest some amazing consistency between those schools identified as 
effective or ineffective in 1992 versus those schools identified as effective or ineffective in 1983- 
85. These same scans of the 1983-85 data also suggested that principal changes had some 
impact on schools improving or failing to improve. Since numerous principal changes occurred 
between 1985 and 1992, empirical verification of similarities in rank was not pursued until it 
could be determined whether or not principal changes had a major impact on results. 

To examine school effectiveness after a change in school principal, 1983-84 and 1984-85 
data were used. Regression equations were computed for grade K-3 and grade K-6 school 
configurations for both school years. (Analyses at other school configurations were not possible 
because of insufficient data.) In each of the four regressions, the assumption was made that a 
simple linear model could describe the existing data structure, regardless of whether or not the 
schools had experienced changes in principals. This assumption was tested using tests of 
homogeneity of regression. Specifically, the data structure for each of the four data sets was 
represented using two lines, one for schools with changes in principals and one for schools with 
no changes in principals. In effect, with the more complete model, no assumptions were made 
regarding equality of intercepts or slopes. The question was then whether the more complete 
model provided more predictive power than the reduced model (i.e., the single regression line 
model). In statistical terms, the question was whether the two intercepts and two slopes in the 
more complete model were equal. If they were unequal, differences in school effectiveness were 
noted after changes in principals; if they were equal, differences in school effectiveness were not 
noted after changes in principals. 
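
A test of homogeneity of regression of this kind can be sketched as an F test comparing the residual sum of squares of the single-line model with that of the two-line model. The sketch assumes, for illustration only, that x is a school's prior-year effectiveness index and y its following-year index; the text does not state which variables entered the regressions, and the data and function names are hypothetical.

```python
def simple_regression_sse(x, y):
    """Residual sum of squares from a one-predictor least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

def homogeneity_f(x_changed, y_changed, x_same, y_same):
    """F statistic for equal intercepts and slopes across the two groups."""
    # Full model: a separate line for each group (four parameters).
    sse_full = (simple_regression_sse(x_changed, y_changed)
                + simple_regression_sse(x_same, y_same))
    # Reduced model: one line for all schools (two parameters).
    sse_reduced = simple_regression_sse(x_changed + x_same, y_changed + y_same)
    df_full = len(x_changed) + len(x_same) - 4
    f = ((sse_reduced - sse_full) / 2) / (sse_full / df_full)
    return f, (2, df_full)

# Hypothetical toy data: six schools with a principal change, six without.
x_same = [40, 45, 50, 55, 60, 65]
y_same = [41, 44, 51, 54, 61, 64]
x_changed = [40, 45, 50, 55, 60, 65]
y_changed = [49, 48, 51, 50, 53, 52]

f_stat, dfs = homogeneity_f(x_changed, y_changed, x_same, y_same)
```

A large F relative to the reference F distribution with the returned degrees of freedom indicates that the two groups are not adequately described by a single regression line.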

Of the four tests of homogeneity of regression, three were significant. This suggests that 
differences in school effectiveness occur after changes in principals and makes any analysis of 
1983-85 results versus 1992 results for consistency inappropriate because of the large number of 
principal changes that occurred during that time frame. 

Adjustments To The Model - Aiken and West (1991) present the argument that when all 
predictor variables are entered into the same equation, the slope of the regression line on the 
criterion variable for each value of each of the predictor variables is uniquely computed. This 
eliminates the necessity of assuming parallel regression lines for the various predictor variables. 
In empirical checks of the parallelism of the regression lines in the current model, most were 
found to be parallel. One exception was the Hispanic Limited English Proficient group at the 
seventh grade level. Simulations are currently being run to determine the differences, if any, 
produced by the two models (that is, the two-stage process currently used versus the fully 
specified model). If the two models produce dissimilar results, the Accountability Task Force 
must weigh the relative advantage of widespread political acceptance of the current model versus 
a possible slight advantage in accuracy of the fully specified model. It is expected that most 
results will be very similar. 



DISCUSSION 

For 1992-93 the outcomes used for the effectiveness indices include a nationally normed 
standardized test (ITBS, grades 1-2; NAPT, grades 3-11), a State-mandated criterion-referenced 
test which includes a writing sample (TAAS, grades 4, 8, 10), 143 separate course-related 
criterion-referenced tests (ACP, grades 7-12), student promotion rate (grades K-8), student 
graduation rate (grades 9-12), student attendance rate (grades K-12), and percentage of students 
taking the SAT and average scores on the SAT (grades 11 and 12). 

For 1993-94, six more outcome variables will be added to the equations. These include 
dropout rate (grades 7-12), student enrollment in accelerated courses with associated ACP scores 
(grades 7-12), high school enrollment in advanced diploma plans (grades 9-12), post-graduate 
enrollment in college or business schools, percent tested and average scores on the Preliminary 
Scholastic Aptitude Tests (PSAT, grade 10), and post-graduate pursuits. 

A significant change will occur in 1993-94 when about twenty-five percent of the ACP's 
will have components which include performance tests. Performance items and detailed scoring 
protocols will be provided to the schools. Samples will be drawn and selected tests rescored. 
Performance test results will then be adjusted by their reliability and included as outcomes in the 
equations. Preliminary analyses of the results of performance tests suggest that they are much 
more difficult than the average norm-referenced test over the same material (Dryden, 1991). 

The effectiveness indices are an important part of the three-tier system of accountability 
being implemented in the DISD (Webster and Edwards, 1993). Training modules for school 
staffs in keeping and scoring portfolios of student work, designing and scoring performance tests, 
conducting protocol analysis, developing teacher-made tests, interpreting and using data, and 
designing and conducting appropriate action research are being undertaken to broaden the 
information that is currently available on student performance. Great emphasis is being placed 
on providing data for diagnosis and improvement. Figure 1.0 displays the range of data currently 
available to the schools. Indicators that are collected centrally and provided to the schools are 
specified with an "E". Indicators that school staff are being trained to collect and maintain 
themselves are specified with a "C". State academic excellence indicators are asterisked, while 
variables that are or will become outcome variables in the effectiveness indices are marked with 
a "#". 

The effectiveness indices include a wide range of variables, chosen and weighted by an 
Accountability Task Force which represents a composite of district groups vitally interested in 
education, and adjusted for inputs that are not under the control of the schools. Schools derive 
no particular advantage by starting with high-scoring or low-scoring students of any particular 
ethnic or economic group, are only held accountable for the outcome levels of cohorts of their 
continuously enrolled students, and are held accountable for a broad array of important education 
outcomes in addition to standardized test scores. The effectiveness indices are designed to foster 
teamwork among school staffs within schools in that school staffs must, in order to achieve the 
necessary improvements, work together in a coordinated effort. For this reason, the program 
does not reward individual competition among teachers. 

Dallas Independent School District schools and their staffs were eligible for cash awards 
for 1991-92 performance based on the school effectiveness methodology under the District's 
School Performance Improvement Awards Program. In September of 1992, 2.4 million dollars 
was distributed to effective schools and their employees. Half of the 2.4 million dollars was 
budgeted by the District; the other half came from the community. To qualify, schools had to 
exceed prediction on the effectiveness indices, test 95% of their eligible students, and outgain the 
national norm group in at least fifty percent of their cohorts. Once a school was selected as an 
award winner, the school received $2000 for its activity fund, each member of its professional 
staff received $1000, and each member of its support staff received $500. This program is 
continuing in 1992-93. Appendix A contains the details of the program. 

Figure 1.0 Formative and Summative Indicators Available to DISD Schools 

[indicator name missing in source] | 1, 2, 4 | Winter (grades 4-12) 
Student Climate Survey, Grades 4-12 (E) | 4 | Provided on request by EPS 
[indicator name missing in source] | 4 | Provided on request by EPS 
Sociograms of Informal Interaction (lunch, recess, faculty meetings, etc.) (C) | 4 | Local Option 
[indicator name illegible in source] (E) | 4 | Fall and Spring 
Assistance and Consultation Team (ACT) Surveys (global issues, case management, training on mental health principles) (E) | 4 | Fall and Spring 
Measures of Mobility and Stability (E) | [illegible] | Fall 
Percent Eligible Tested versus Average Daily Attendance (E) | 5 | Fall 
Monitoring of Local School Accreditation Remedies (C) | 6 | Fall 
Monitoring of Implementation of Local School Programs (C) | 7 | Local Option 
Monitoring of Instructional Delivery (C) | 1, 2, 4, 6, 7 | Local Determination 
Student Retention Rate (E) | 7 | Fall # 
* Student Enrollment in Advanced Courses (E,C) | 8 | Fall, Spring # 
* Student Enrollment in Honors (E,C) | 8 | Fall, Spring # 
* Student Enrollment in Diploma Plans (E,C) | 8 | Fall, Spring # 
Survey of Student Course Interest (Grades 7-12) (E) | 8, 9 | Provided on request by EPS 
* Dropout Rate (E) | 9 | December # 
* Graduation Rate (E) | 9 | Fall # 
* SAT/ACT Participation Rates (E,C) | 10 | Fall # 
* SAT/ACT Scores (E) | 10 | Fall # 
[indicator name missing in source] | 10 | Provided by the State 
Graduate Follow-Up (E) | 19 | Fall # 
Student Post-Graduate Pursuits (E) | 8, 9, 10 | Fall # 
PSAT Participation Rates (E) | 10 | Fall # 
PSAT Scores (E) | 10 | Fall # 

* An Academic Excellence Indicator 

TAAS is the Texas Assessment of Academic Skills, a State-administered criterion-referenced test. ITBS is 
the Iowa Tests of Basic Skills. NAPT is the Norm-referenced Assessment Program for Texas, a Texas version of the 
ITBS. ACPs are 143 criterion-referenced course exams, grades 7-12. 

SCHOOL PERFORMANCE IMPROVEMENT AWARDS 

1992-93 



One of the key ingredients of the Commission for Educational Excellence's 
recommendations was an awards plan for effective schools. For 1992-93, the Dallas 
Independent School District (DISD) has budgeted 1.5 million dollars for this system. The 
community will raise $900,000, making a total availability of 2.4 million dollars. The 
selection procedure for determining which schools win is completely objective and is 
designed to award schools and school staffs that show the most improvement on 
important outcomes of schooling. 

1.0 Outcome Variables 

For the 1992-93 school year, awards will be based on school performance on the 
following variables: 

1.1 Elementary Schools 

1.1.1 Student scores on the Reading, Language, Vocabulary (at grade 
levels tested), and Mathematics subtests of the Norm-referenced 
Assessment Program for Texas (NAPT). The NAPT is the State 
replacement for the Iowa Tests of Basic Skills (ITBS) and Tests of 
Achievement and Proficiency (TAP). Grades K-2 will be tested 
with the ITBS. 

1.1.2 Promotion Rate (percentage of students promoted, summer school 
doesn't count). 

1.1.3 Student Attendance 

1.1.4 Student scores on the Texas Assessment of Academic Skills 
(TAAS), Grade 4, Reading, Writing, and Mathematics subtests. 

1.1.5 A special test will be developed or purchased to test 
Spanish-dominant Limited English Proficient students. This test will be 
administered to Spanish-dominant Limited English Proficient 
students who are ineligible to be tested with the ITBS or 
NAPT. 

1.2 Middle Schools 

1.2.1 Student scores on the Reading, Language, and Mathematics 
subtests of the NAPT. 

1.2.2 Promotion Rate (percentage of students promoted, summer school 
doesn't count). 

1.2.3 Student Attendance 

1.2.4 First and second semester student Assessment of Course 
Performance (ACP) scores in mathematics (Math 7, Math 8, Math 
7 Pre-Honors, Algebra I, Pre-Algebra 8); language arts (Language 
Arts 7, Language Arts 8, English I); social studies (Texas 
History/Geography 7, U.S. History 8); and science (Life Science 
7, Earth Science 8, Pre-Honors Earth Science). 

1.2.5 First and second semester student ACP scores in ESOL I, II, and 

1.2.6 First and second semester student ACP scores in Reading 
Improvement, Reading 7, and Reading 8. 

1.2.7 Student scores on the Texas Assessment of Academic Skills (TAAS), 
Grade 8, Reading, Writing, and Mathematics subtests. 

1.3 High Schools 

1.3.. 1 Student scores on the Reading, Written Expression, and 
Mathematics subtests of the NAPT. 

1.3.2 Student Graduation Rate (the percent of students who graduate by 
the Spring semester five years after they enrolled in the ninth 
grade). 

1.3.3 Percentage of seniors who have ever taken the Scholastic Aptitude 
Test (SAT)/American College Test (ACT). 

1.3.4 SAT/ACT Achievement (juniors and seniors, highest score). 

1.3.5 Student Attendance 

1.3.6 First and second semester student ACP scores in mathematics 
(Algebra, Algebra II, Algebra II Pre-Honors, Geometry Pre-Honors, 
and Geometry); language arts (English I, II, III, IV); social 
studies (U.S. Government, World History, World Geography, U.S. 
History); science (Physical Science, Applied Biology, Biology I, 
Chemistry, Physics); and World Languages (Spanish I, II, French 
I, II, German I). 

1.3.7 Student scores on the TAAS, Grade 10, Reading, Writing, and 
Mathematics subtests. 

1.3.8 Student ACP scores in ESOL I, II, III, IV. 

1.3.9 First and second semester ACP scores in Reading Improvement. 

2.0 Qualifying Schools 

All schools that have the necessary outcome data and all students will be included 
in the outcome equations. However, in order to be eligible for a School 
Performance Improvement Award all schools must: 
2.1 Test at least 95% of their eligible continuously enrolled students or 
increase their percent of eligible continuously enrolled students tested by 3% 
over Spring, 1992. These statistics refer to percent tested on the NAPT. 
Students at the School Community Guidance Center (SCGC) will be 
tested and attributed to their home schools. 

2.2 Have at least [illegible] percent average daily attendance for grades 4 - 8, or 10 on 

2.3 Exceed the national norm group growth curves, or be above the national 
norm group, in at least 50% of school cohorts on the NAPT 

If a school does not satisfy each of the aforementioned criteria, it will not be 
eligible for a School Performance Improvement Award. 
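
The qualifying criteria above can be sketched as a single check. This is only an illustration under stated assumptions: the two testing conditions in 2.1 are read as alternatives, the 3% increase is read as three percentage points, and the attendance threshold in 2.2 is passed in as a parameter (`min_attendance_pct`) because its value is not legible in this copy. The function and parameter names are hypothetical and do not come from the plan.

```python
def is_eligible(pct_tested: float,
                pct_tested_prior: float,
                attendance_pct: float,
                min_attendance_pct: float,
                cohorts_meeting_norm: int,
                total_cohorts: int) -> bool:
    """Return True if a school meets all three qualifying criteria (2.1-2.3).

    Assumptions (not stated in the plan): the 3% increase in 2.1 is measured
    in percentage points, and min_attendance_pct is supplied by the caller.
    """
    # 2.1: test at least 95% of eligible students, or improve by 3 points.
    tested_ok = pct_tested >= 95.0 or (pct_tested - pct_tested_prior) >= 3.0
    # 2.2: meet the required average daily attendance.
    attendance_ok = attendance_pct >= min_attendance_pct
    # 2.3: meet or exceed the national norm in at least 50% of cohorts.
    norm_ok = total_cohorts > 0 and cohorts_meeting_norm / total_cohorts >= 0.5
    return tested_ok and attendance_ok and norm_ok
```

A school that misses any one of the three checks is treated as ineligible, which matches the requirement that each criterion be satisfied.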

3.0 Establishing School Cohorts 

Since the School Performance Improvement Award is based entirely on student 
outcomes (once a school has qualified), it is important to specify which students 
will be included in the various cohorts. Therefore: 

3.1 Establishing School Cohorts 
All students who: 

3.1.1 are enrolled continuously in a specific school from the end of the 
first six weeks, and 

3.1.2 have the necessary pre-observation data in the DISD and 
post-observation data for the 1991-92 school year in that specific 
school, and 

3.1.3 are eligible for the testing program according to the DISD 
Systemwide Testing Policy (on the testing variables) 

will be included in the cohort longitudinal analysis. Thus, in order to be 
included as a member of a given school's cohort, a student must be 
enrolled in that school by the end of the first six weeks, have the necessary 
pre-observation data, and be tested in that school in accordance with the DISD 
systemwide testing program. Students who transfer out 
of a school and back into that school over a short period of time will be 
included in that school's cohort. Schools that, in the opinion of the 
Accountability Task Force, attempt to manipulate their continuously 
enrolled student population will be disqualified from the Award Program. 

4.0 Qualifying Staff for Awards 

Once a school has been empirically selected for a School Performance 
Improvement Award, the school will receive $2000 to be spent in a manner, other 
than compensation, to be determined by the School Community Council (SCC) 
Committee in School Centered Education (SCE) schools or the Faculty 
Advisory Committee in Non-SCE schools. Performance awards will also be 
distributed in the form of compensation to the staff of winning schools based on 
the following criteria. 

4.1 Eligible Staff 

4.1.1 Principals will be eligible to receive a stipend. 

4.1.2 All campus personnel will be eligible to receive a stipend if they 
are full-time professional or support personnel who are assigned to 
a single campus and are evaluated by a local campus administrator. 

4.1.3 Professional or support personnel who are assigned to more than 
one campus and evaluated by one or more campus administrator(s) 
will receive a pro rata share of the stipend. Proration will be based 
on the percentage of time assigned to one or more winning schools. 

4.1.4 In circumstances where there are variable hours worked within an 
employee classification the employee will receive a pro rata share 
based on the percentage they work of the standard work day of 
their respective classification. 

4.2 Successful Evaluation 

Individuals must be evaluated "Meets Expectations" or above in order to 
participate in the monetary awards. 

4.3 Stipends 

4.3.1 Professional Staff 

Stipends will be paid to professional staff who are assigned to 
winning schools. The amount of the stipend will be determined by 
the considerations specified in Section 4.1 and by attendance 
during the contract year: 

4.3.1.1 Attendance 

Eligible professional staff who are present all contract 
days of the school year and meet requirements 4.1.1 or 
4.1.2 will receive a stipend of $1,000. Professional staff 
who are not present all contract days will receive an 
award of one thousand dollars minus five dollars per day 
for every contract day absent. If professional staff are 
not full-time at a winning school, their share will be 
calculated in the manner specified in 4.1.3 or 4.1.4. 

4.3.2 Support Staff 

Stipends will be paid to support staff who are assigned to winning 
schools. The amount of the stipend will be determined by the 
considerations specified in Section 4.1 and by attendance during 
the contract year. 

4.3.2.1 Attendance 
Eligible support staff who are present all contract days 
of the school year and meet requirements 4.1.1 or 4.1.2 
will receive a stipend of $500. Support staff who are not 
present all contract days will receive an award of five 
hundred dollars minus $2.50 per day for every contract 
day absent. If support staff are not full-time at a winning 
school, their share will be calculated in the manner 
specified in 4.1.3 or 4.1.4. 
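
The stipend rules in 4.2 and 4.3 amount to simple arithmetic, sketched below. The base amounts and per-day deductions are taken from the text; the function name, the `share` parameter (the pro rata fraction from 4.1.3 or 4.1.4), the order of operations (deduct for absences first, then prorate), and the floor at zero are assumptions made for illustration.

```python
# Base stipends and per-day absence deductions from 4.3.1.1 and 4.3.2.1.
BASE = {"professional": 1000.00, "support": 500.00}
PER_DAY_ABSENT = {"professional": 5.00, "support": 2.50}


def stipend(staff_type: str, days_absent: int, share: float = 1.0,
            meets_expectations: bool = True) -> float:
    """Return the stipend in dollars for one staff member at a winning school.

    share is the pro rata fraction (1.0 for full-time staff on one campus).
    Assumption: the absence deduction is applied before proration, and the
    result is never negative.
    """
    # 4.2: staff evaluated below "Meets Expectations" do not participate.
    if not meets_expectations:
        return 0.0
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    amount = BASE[staff_type] - PER_DAY_ABSENT[staff_type] * days_absent
    return round(max(amount, 0.0) * share, 2)
```

For example, a full-time professional absent four contract days would receive 1000 - 4 x 5 = $980, and a half-time support employee absent two days would receive (500 - 2 x 2.50) x 0.5 = $247.50 under these assumptions.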



5.0 Number of Winning Schools 

The number of winning schools will depend on the size of the schools that win 
[illegible] winning professional and 800 winning support 
personnel. The determining factor will be the number of staff associated with 
winning schools that can be awarded stipends of up to $1,000 and $500 for 
professional and support personnel, respectively, within the available $2.4 million. 
If large schools win, fewer schools will be included in the awards. 
If a number of small schools win, more schools will be included. 

6.0 Establishing Appropriate Comparisons 

In order to allow all school configurations a reasonable chance of receiving a 
School Performance Improvement Award, District schools will be chosen 
according to the following categories: 

6.1 Categories for Comparison 

        Grade Level(s) 
6.1.1   PK-3 
6.1.2   4-6 and Vanguards 
6.1.3   PK-6 and Vanguards 
6.1.4   7-8 and Academies 
6.1.5   9-12 and Magnets 

The amount of money available for each level will be determined by the 
percentage of school-based professional personnel employed at each level. 
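
The proportional split described above can be illustrated with a short sketch. The function name and the staff counts used in the example are hypothetical; only the rule itself (funds divided in proportion to school-based professional personnel at each level) comes from the text.

```python
def allocate_by_level(total_funds: float,
                      professionals_by_level: dict[str, int]) -> dict[str, float]:
    """Split total_funds across grade-level categories in proportion to the
    number of school-based professional personnel employed at each level."""
    total_staff = sum(professionals_by_level.values())
    if total_staff <= 0:
        raise ValueError("at least one professional staff member is required")
    return {level: total_funds * count / total_staff
            for level, count in professionals_by_level.items()}


# Hypothetical staff counts, for illustration only.
example = allocate_by_level(2_400_000, {"PK-3": 100, "4-6": 300, "9-12": 600})
```

With these made-up counts, the PK-3 level holds 10% of the professional staff and would therefore receive 10% of the funds.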

6.2 Magnets, Vanguards, and Academies 

Magnets, Vanguards, and Academies will be treated as separate programs 
at the appropriate level if they have separate teaching and administrative 
staffs. Otherwise, they will be included with the appropriate school. The 
following academies and vanguards, located in the same building with a 
comprehensive school, will be treated as separate schools: 

Holmes Academy 
Spence Academy 
Lanier Vanguard 
Polk Vanguard 
6.3 Schools Not Meeting Standard Criteria 

Several schools have insufficient data on one or more critical variables 
included in the school effectiveness indices and therefore cannot be 
included in the Award Program. These schools are not included in the 
regular process due to the nature of the school or the student enrollment at 
the school. In either case, school effects cannot be computed using the 
procedures proposed for the school effectiveness indices. The schools 
which are not yet included in the process for 1992-93 are: 

Health Special 

E.D. Walker Special Education Center 
Multiple Career Center 
Alternative Academic Cooperative Center 
Evening Schools (Skyline and Kimball) 
Metropolitan Education Center 
School Community Guidance Center 
Letot Academy 
Brashear 

Quentin D. Corley Academy 
Edison Work Activity Center 
Science Magnet 

An Ad Hoc Committee of the Accountability Task Force, chaired by Dr. 
Herman Saettier, has been appointed to work with these schools in 
producing an appropriate plan for the 1992-93 school year. 

6.4 Employees Not Meeting Standard Criteria 

Classifications of employees who are, because of budgetary or supervisory 
criteria, excluded from participation in this program are invited to submit 
ideas and/or proposals that might achieve the same goals for their 
respective groups. These proposals should be submitted to Robby Collins, 
Executive Manager, Governmental/Internal Relations, 3700 Ross Avenue, 
Box y. All proposals will be considered by the Accountability Task Force 
for possible implementation. 



7.0 The Equations 



The school effectiveness methodology defines a school's effectiveness as being 
associated with exceptional measured performance above or below that which 
would be expected across the entire District. When a school's population of 
students departs markedly from its own pre-established trend or from the more 
general trend of similar students throughout the District, this departure is 
attributed to school effect. The problem of measuring a school's effect, then, 
becomes one of establishing the student levels of accomplishment on the various 
outcome variables, setting levels of expected performance based on these 
variables, and determining the extent to which its students, on the average, 
exceed or fall short of expectation. The procedures involve regression analysis to 
compute prediction equations by grade level or by school for each outcome 
variable independent of school identification and then using these equations 
to determine gains over expectations. A major feature of this 
methodology also involves assigning relative weights to each of the outcomes. Once 
weighted levels of performance have been determined, the methodology provides 
effectiveness indices, a measure of how well a school performs relative to other 
schools throughout the District. Important characteristics of the methodology include: 

7.1 Schools are only held accountable for the outcome levels of students who 
have been exposed to that school's instructional program. That is, schools 
are only held accountable for their continuously enrolled students. 

7.2 The influence of important background variables of students, over which 
the schools have no control, is eliminated from the equations. That is, 
each predictor and outcome variable is regressed on the set of background 
variables (ethnicity, gender, limited English proficiency status, and free or 
reduced lunch status). The residuals from this regression then become 
the predictor and outcome variables for the next stage of prediction. This 
approach addresses practitioners' concerns about the 
impact of background variables on outcomes. Other fairness variables 
including, but not limited to, size of school, overcrowding conditions, etc., 
will be examined for inclusion. 

7.3 The outcome variables are weighted by the Accountability Task Force. 

7.4 Schools derive no particular advantage by starting with high-scoring or 
low-scoring students. That is, the equations set individual expectations for each 
student based on that student's placement on the pretest(s) of interest. 
Lower scoring students have lower predicted scores. Higher scoring 
students have higher predicted scores. 

7.5 Only one year of historical data is used. That is, a stepwise regression 
approach is used on the residuals of multiple predictors so that 
a satisfactory prediction is achieved without having to go back more 
than one year. This maintains the degrees of freedom associated with the 
equations since, in an urban district, each additional year of data used 
significantly reduces the degrees of freedom associated with the equations. 
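
The two-stage procedure described in this section can be sketched as follows. This is a minimal illustration, not the District's implementation: it uses a single pretest as the predictor and ordinary least squares at both stages, whereas the plan describes stepwise regression over multiple predictors. The function names, the array layout, and the use of NumPy are assumptions.

```python
import numpy as np


def residualize(values, background):
    """Regress values on the background variables (with an intercept) and
    return the residuals, i.e. the part not explained by background."""
    values = np.asarray(values, dtype=float)
    design = np.column_stack([np.ones(len(values)),
                              np.asarray(background, dtype=float)])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef


def school_effects(pretest, outcome, background, school_ids):
    """Return the mean gain over expectation for each school.

    Stage 1: remove background influence from predictor and outcome.
    Stage 2: predict the residual outcome from the residual pretest,
             ignoring school membership.
    Stage 3: average each school's student-level prediction errors.
    """
    pre_resid = residualize(pretest, background)
    out_resid = residualize(outcome, background)

    design = np.column_stack([np.ones(len(pre_resid)), pre_resid])
    coef, *_ = np.linalg.lstsq(design, out_resid, rcond=None)
    gain = out_resid - design @ coef  # positive = above expectation

    schools = np.asarray(school_ids)
    return {s: float(gain[schools == s].mean())
            for s in np.unique(schools).tolist()}
```

In this sketch a positive value means the school's continuously enrolled students scored, on average, above what the District-wide equation predicted for them. Combining several outcomes into one index would additionally require the weights set by the Accountability Task Force.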

8.0 Outcome Variables for 1993-94 

Some outcome variables, because of timeliness, are not 
included for 1992-93 but will be included for 1993-94. These include: 

8.1 Dropout rate for middle and high schools. Because of the time that 
dropout rate becomes available, this will be a time-lag design with the 1992-93 
dropout rate being included in the 1993-94 equations. 

8.2 [illegible] for middle and 

8.3 High school student enrollment in Advanced Diploma Plans. 

8.4 Post-graduate enrollment in college or business schools. This will also be 
a time-lag design, with a graduate follow-up of the 1992-93 class. 

8.5 Achievement on the Preliminary Scholastic Aptitude Test (PSAT). 

8.6 Post-graduate career pursuits. 

8.7 The weight for Graduation Rate for high schools will increase by one 
point per year. 



9.0 Weights of Outcome Variables 

For the 1992-93 school year, outcome variables will have the following weights: 